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On  the  Geometry  of  Texture 


Ron  Kimmel,  Nir  A.  Sochen,  and  Ravi  Malladi 


Abstract.  We  consider  texture  images  as  a composition  of  manifolds  in 
the  feature-space.  This  geometrical  interpretation  leads  to  a natural  way 
for  texture  enhancement.  A flow,  based  on  manifold  volume  minimiza- 
tion, yields  a natural  enhancement  procedure  for  texture  images.  The  2D 
Gabor-Morlet  transform  is  first  used  to  decompose  the  image  into  sub- 
band images,  where  each  sub-image  corresponds  to  a different  scale.  Each 
sub-band  image  may  be  considered  as  a 3D  manifold  in  a 5D  space  from 
which  the  original  image  can  be  reconstructed  in  a numerically  stable  way. 
Following  our  previous  results,  we  then  invoke  Polyakov  action  from  String 
Theory,  and  develop  a minimization  process  through  a geometric  flow  that 
efficiently  enhances  each  sub-band  image  in  a spatial-orientation  feature 
space.  Finally,  the  enhanced  sub-band  images  are  composed  back  into  an 
enhanced  texture  image. 


§1.  Introduction 

Texture  plays  an  important  role  in  the  understanding  process  of  many  im- 
ages. Therefore,  it  became  an  important  research  subject  in  the  fields  of 
psychophysics  and  computer  vision.  The  study  of  texture  starts  from  the  pre- 
image that  describes  the  physics  and  optics  that  transforms  the  3D  world  into 
an  image,  through  human  perception  that  starts  from  the  image  formation 
on  the  retina  and  tracks  its  interpretation  at  the  first  perception  steps  in  the 
brain. 

The  psychophysical  research  of  these  first  steps  focuses  on  the  way  the 
brain  cells  are  activated  under  the  stimulus  of  a given  image.  Such  experi- 
ments combined  with  recent  developments  in  the  field  of  signal  representation 
led  to  relatively  simple  mathematical  models  that  simulate  the  first  steps 
in  the  way  our  brain  represents  images.  One  such  model  is  based  on  the 
2D  Gabor/Morlet-wavelet  transform  of  the  image.  Some  nice  mathematical 
properties  and  the  relation  of  this  transform  to  the  physiological  behavior  were 
studied  in  [6,10].  This  model  was  used  for  the  segmentation,  interpretation 
and  analysis  of  texture  [2,7],  for  texture  based  browsing  [8],  etc. 
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In  this  paper  we  use  the  same  space  to  represent  texture  images.  Then, 
we  search  for  a geometrical  way  to  improve  and  enhance  texture  based  im- 
ages. The  geometrical  feature  enhancement  procedure  we  introduce  may  serve 
as  a step  towards  segmentation.  This  procedure  is  based  on  a flow  in  the 
transformed  space  in  which  the  transform  coefficients  are  treated  as  higher 
dimensional  manifolds.  A special  minimization  process  preserves  domains  of 
constant/homogeneous  texture,  enhances  the  texture  in  each  domain,  and 
thereby  sharpens  the  boundaries  between  neighboring  domains  with  different 
textures. 

The  remainder  of  this  paper  is  organized  as  follows:  Section  2 briefly 
reviews  our  previous  results:  the  definition  of  arclength,  the  consideration  of 
images  as  surfaces,  and  the  minimization  of  Polyakov  action  that  leads  to  a 
geometric  flow  we  named  the  Beltrami  flow.  Next,  Section  3 describes  the 
relevant  feature  space  to  the  texture  case.  It  gives  the  basics  for  constructing 
the  2D  Gabor-Morlet  wavelet  decomposition,  and  a simple  way  for  composing 
the  image  back.  Section  4 presents  experimental  results  of  the  Beltrami  flow 
in  the  decomposition  feature  space,  for  simple  gray  level  texture. 

§2.  Images  as  Embedded  Maps  that  Flow  Toward  Harmonic  Maps 

In  [11]  we  consider  images  as  2D  surfaces  in  higher  dimensional  spaces.  We 
construct  enhancement  and  segmentation  procedures  for  color  images  as  2D 
surfaces  in  5D  ( x , y,  r,  g,  b)  space.  As  shown  in  [4],  the  idea  of  images  as  curved 
spaces  is  not  limited  to  2D  surfaces,  so  that  movies  and  volumetric  images  can 
be  considered  as  3D  hypersurfaces  (manifolds)  in  4D  (x,y,z,I(x,y,z))  space. 

Our  geometric  framework  finds  a seamless  link  between  the  L\  norm, 
used  in  the  Osher-Rudin  TV  image  enhancement  and  its  variants,  and  the  L i 
norms,  used  in  Mumford-Shah  image  segmentation  and  its  variants.  TV  (To- 
tal Variation)  schemes  are  based  on  minimizing  the  Li  norm,  namely  f |V/|, 
while  the  L 2 norm  minimizes  f |V/|2.  Our  framework  is  based  on  the  ge- 
ometry of  the  image  and  its  interpretation  as  a surface.  The  aspect  ratio 
between  the  gray  level  and  the  xy  image  plane,  is  the  switch  between  the  two 
commonly  used  norms.  This  observation  made  it  possible  to  show  that  our 
multi-channel  (color)  enhancement  procedure  may  be  considered  as  a gener- 
alization of  the  powerful  TV  scheme  that  is  now  commonly  used  in  the  high 
tech  image  processing  industry.  This  procedure  yield  very  promising  results 
for  color  image  enhancement  [11].  In  this  work,  we  propose  a flow  in  a rich 
feature  space  which  is  different  from  the  image  spatial-intensity  space. 

Representation  and  Riemannian  structure 

We  represent  an  image  and  other  local  features  as  embedding  maps  of  a Rie- 
mannian manifold  in  a higher  dimensional  space.  The  simplest  example  is 
the  image  itself  which  is  represented  as  a 2D  surface  embedded  in  1R3.  We 
denote  the  map  by  X : E — > 1R3,  where  E is  a two-dimensional  surface,  and  we 
denote  the  local  coordinates  on  it  by  (a1 , <r2).  The  map  X is  given  in  general 
by  (W1(cr1,  cr2),  W2(<71,  cr2),  W3(<t1,  cr2)).  In  our  example  we  represent  it  as 
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(X1  = cr1,^2  = a2,X3  = J(ct1j  (72)).  We  choose  a Riemannian  structure  on 
this  surface,  namely,  a metric.  The  metric  is  a positive  definite  and  symmetric 
2-tensor  that  may  be  defined  through  the  local  distance  measurements: 

ds 2 = gv.vdav’dav  = guide1)2  + 2512d(71d<72  + glider2)2,  (1) 

where  we  used  Einstein  summation  convention  in  the  second  equality.  We 
denote  the  inverse  of  the  metric  by  g . 

Polyakov  action:  a measure  on  the  space  of  embedding  maps 

Let  us  briefly  review  our  general  framework  for  non-linear  diffusion  in 
computer  vision.  We  will  use  this  framework  in  Section  4 to  diffuse  a tex- 
tured image  in  the  transformed  domain.  The  equations  will  be  derived  by 
a minimization  problem  from  an  action  functional.  The  functional  in  ques- 
tion depends  on  both  the  image  manifold  and  the  embedding  space.  Denote 
by  (E,p)  the  image  manifold  and  its  metric,  and  by  ( M,h ) the  space-feature 
manifold  and  its  metric.  Then  the  functional  5[X]  attaches  a real  number  to 
a map  X : E — » M : 


S[Xi,gliU,hij}  = J dViVX^VX^ghij, 


(2) 


where  dV  is  a volume  element  and  {VR,  VG)g  = g^d^Rd^G.  This  functional, 
for  m = 2,  was  first  proposed  by  Polyakov  [9]  in  the  context  of  high  energy 
physics,  and  the  theory  is  known  as  string  theory. 

Using  standard  methods  in  the  calculus  of  variations  (see  [11]),  the  Euler- 
Lagrange  equations  with  respect  to  the  embedding  are 


P) 


Since  (gliL, ) is  positive  definite,  g = det(<^„)  > 0 for  all  cr1*.  This  factor  is 
the  simplest  one  that  doesn’t  change  the  minimization  solution  while  giving 
a reparameterization-invariant  expression.  The  operator  that  is  acting  on  X% 
is  the  natural  generalization  of  the  Laplacian  from  flat  spaces  to  manifolds 
and  is  called  the  second  order  differential  parameter  of  Beltrami  [5],  or  for  short 
Beltrami  operator,  and  we  will  denote  it  by  A9. 

For  a surface  E embedded  in  3 dimensional  Euclidean  space,  we  get  a 
minimal  surface  as  the  solution  to  the  minimization  problem.  In  order  to 
see  this  and  to  connect  to  the  usual  representation  of  the  minimal  surface 
equation,  we  notice  that  the  solution  of  the  minimization  problem  with  respect 
to  the  metric  is 

= d^dyXi.  (4) 

Plugging  this  induced  metric  in  the  first  Euler-Lagrange  equation  (3),  we  get 
the  steepest  decent  flow 


Xt  = RTS, 


(5) 
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where  H is  the  mean  curvature,  and  N is  the  normal  to  the  surface  given  by 

rr  (l+ll)Iyy-2IxIyIxy  + (l  + ll)Ixx 

“ 3 5 

. 55  (6) 
N = —p(  — Iy,  —III  1)T1 

and  g = 1 + 1%  + 1%.  We  see  that  this  choice  gives  us  the  mean  curvature  flow! 
This  should  not  be  a surprise,  since  the  action  functional  for  the  above  choice 
of  metric  g^v  is 


5 = y dArea  = j d2a^/g  = J d2ir^/det(9;jA't5^A',),  (7) 

which  is  the  Euler  functional  that  describes  the  area  of  the  surface,  also  known 
in  high  energy  physics  as  the  Nambu  action. 

In  general,  for  any  manifold  E and  A/,  the  map  X : E — + M that  min- 
imizes the  action  S with  respect  to  the  embedding  is  called  a harmonic  map. 
The  harmonic  map  is  the  natural  generalization  of  the  geodesic  curve  and  the 
minimal  surface  to  higher  dimensional  manifolds. 


§3.  Gabor/Morlet-wavelets:  A Natural  Space  for  Texture  Images 

In  [6]  Lee  argues  that  the  2D  Gabor/Morlet  wavelet  transform  with  specific 
coefficients  is  an  appropriate  mathematical  description  for  images.  He  based 
his  findings  on  neurophysiological  evidence  based  on  experiments  on  the  visual 
cortex  of  mammalian  brains.  These  experiments  indicate  that  the  best  model 
for  the  filter  response  of  simple  cells  are  self-similar  2D  Gabor/Morlet  wavelets. 

Following  Lee  [6],  let  us  briefly  describe  the  2D  Gabor/Morlet  wavelets 
that  model  the  simple  cells.  The  2D  wavelet  transform  on  an  image  I(x,y)  is 
defined  as 

{TwavI)(xo,yo,0,a)  = \\a\\~1  J J dxdy I {x,y)rpg  ^ , V , (8) 

where  a is  a dilation  parameter,  xo  and  y o are  the  spatial  translations,  and  6 
is  the  wavelet  orientation  parameter.  Here 

il>(x,y,x0,y0,e,a)  = ||a|rVfl(3'  J0  < V J0)’  (9) 

is  the  2D  elementary  wavelet  function  rotated  by  9.  Based  on  neurophysiolog- 
ical experiments,  a specific  Gabor  elementary  function  is  used  as  the  mother 
wavelet  to  generate  the  2D  Gabor/Morlet  wavelet  family  by  convolving  the 
image  with 

iP(x,y)  = _ c-4),  (10) 

v 
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(x\  _ ( cos 0 sin d\  f x\  _ , 

y)~  \-sin<9  cos  0 ) \y  ) ' 

The  discretization  of  (8),  i.e  (7J5““j  mI),  is  denoted  by  WP:qti)rn  and  is 
given  by 

WW,m  = a~m  j J dxdyI(x,y)-ipiAe(a~m(x  - pAx),a~m(y  - qAx)),  (12) 

where  A*  is  the  basic  sampling  interval,  A 9 = 2i r/L,  and  the  angles  are  given 
by  IA9,  where  l = 0, L — 1,  and  L is  the  total  number  of  orientations. 
p,  q and  m are  integers  determining  the  position  and  scaling.  Note  that  as 
m increases,  the  sample  intervals  get  larger  forming  a pyramidal  structure. 
Equation  (12)  can  be  read  as  a projection  onto  a discrete  set  of  basis  functions 

Hp,(7,/,Tn  — (13) 

The  real  number  k determines  the  frequency  bandwidth  of  the  filters  in 
octaves  via  the  approximation 

k = a^T= (14) 

where  <f>  is  the  bandwidth  in  octaves,  e.g.  for  a = 2 and  <j>  = 1.5  we  get 
k « 2.5.  In  the  above  approximation  the  DC  normalization  term  e~k  !2  that 
is  required  to  make  a wavelet  basis  out  of  the  Gabor  basis  is  ignored,  and 
we  consider  a = k/oj q.  So  the  peaks  of  the  scaled  mother  wavelets  in  the 
frequency  domain  are  (approximately)  at  the  locations  a~mw o- 

For  our  application  we  have  chosen  to  work  with  a frame.  The  concept 
of  frames  was  introduced  in  [3].  A family  of  functions  (iftj)  is  a frame  if  there 
exist  A > 0,  B < oo  that  are  called  frame  bounds  so  that  for  every  / we  have 

AH/II2  < ^|(/,V'J)|2  < S||/||2,  (15) 

j 

where  ||/||  = f f2.  One  could  recognize  this  as  a generalization  of  Parseval’s 
theorem.  A discrete  family  of  wavelets  that  forms  a frame  provides  a complete 
representation  of  any  function.  In  some  cases  it  is  possible  to  recover  a function 
with  good  approximation  by  the  inversion  formula 

/«  (16) 

j 

The  ratio  B/A  measures  the  tightness  of  the  frame.  When  A = B,  the  frame 
is  tight  and  the  reconstruction  by  summation  is  exact.  Thus,  as  B/A  ap- 
proaches 1,  we  may  still  use  the  above  reconstruction  equation  as  a good 
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Fig.  1.  The  wavelet  basis  functions  (up  to  translations).  The  basis  functions 
are  presented  in  a gray  level  array,  real  (symmetric)  and  imaginary  (a- 
symmetric)  for  the  8 angles  [0, 7 r]  and  5 scales. 


Fig.  2.  The  half  peak  contours  in  the  frequency  domain  of  the  wavelet  basis 
functions  in  the  previous  figure,  (5  scales  16  orientations). 


approximation.  That  is,  we  treat  our  discrete  wavelets  as  an  orthonormal 
basis. 

We  denote  the  2D  Gabor/Morlet- wavelet  transform  as  W(x,y,8,a),  such 
that  R = Real(W)  and  J = Imag(lF),  where  for  the  discrete  case  a = am  and 
9 = IA8.  The  response  of  a simple  cell  is  then  modeled  by  the  projection  of 
the  image  onto  a specific  Gabor/Morlet  wavelet. 

Motivated  by  the  arrangement  of  simple  cells  in  our  brain,  with  as  tight 
a frame  as  possible,  we  consider  5 spatial  frequency  octaves,  and  16  angles 
that  discretize  the  [0, 2ir]  angular  interval.  Practically,  we  used  the  symmetry 
properties  of  the  2D  Gabor/Morlet- wavelet  transform:  W(x,y,6  + 7r,cr)  = 
W(x,y,8,a).  Thus,  only  8 angles  are  needed  to  represent  the  discretization 
of  the  full  [0,  2n]  angular  interval  into  16.  We  choose  a = 2 and  As  = 1.  This 
selection  results  in  a frame  bounds  A = 271.95,  B = 233.69,  with  ratio  of 
B/A  = 1.19.  The  fact  that  this  ratio  is  close  to  1 means  that  we  have  a tight 
frame  that  allows  simple  summation  reconstruction.  Figs.  1 and  2 show  the 
basis  functions  we  used.  Periodic  boundary  conditions  are  used  for  the  real 
(symmetric)  part,  and  negative  periodicity  for  the  imaginary  part,  forming 
a ‘Klein  bottle’  coordinate  system  in  ( x,y,6 ).  This  enables  us  to  reduce  the 
memory  complexity  by  a factor  of  2. 


Geometry  of  Texture 


209 


Fig.  3.  A schematic  diagram  of  Gabor/Morlet  wavelet  decomposition  of  the  orig- 
inal image  (at  the  top)  into  the  ( x,y,9 , Wa  (x,  y,  0))  and  the  images  that 
are  the  result  of  reconstruction  by  summation  for  each  scale  cr  separately 
(bottom).  The  last  row  presents  the  reconstruction  result  after  70  iter- 
ation of  the  Beltrami  flow  at  each  scale.  In  all  the  examples  we  use 
L = 16,  a = 2,  k = 2.5,  and  m G {0, ..,  4}. 


For  practical  implementation  that  avoids  the  special  numerical  treatment 
needed  along  the  pyramidal  discrete  cr  scale  axis,  we  consider  each  scale  as  a 
separate  space.  The  induced  metric  for  each  scale  is  then  given  by 


/ 1 + R%  + jf  RxRy  + JxJy 
{duu)  = I RxRy  + JxJy  1 + Ry  + Jy 
y Rx  R()  T JXJ$  Ry  Rff  T <JyJ$ 


R2  + Je 


(17) 
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This  result  can  be  understood  from  the  arclength  definition  in  this  spatial- 
orientation  complex  space,  namely 

ds 2 = dx 2 + dy2  + dO2  + dJ 2 + dR2.  (18) 


Applying  the  chain  rule  on  dR  = Rxdx+Rydy  + Rgd9,  and  similarly  for  dJ,  we 
obtain  the  desired  bilinear  structure  that  describes  the  above  induced  metric 
for  this  case. 

The  gradient  descent  equations  for  the  Polyakov  action  read 

Rt  = AgR,  Jt  — A9J,  (19) 

where  AgX  is  given  in  (3)  with  the  metric  (17). 


§4.  Experimental  Results 

Let  us  start  with  a simple  example.  In  Fig.  3 we  first  decompose  an  image  via 
the  wavelet  transform  into  4 separate  sub-scale  channels.  The  decomposition 
and  the  result  of  applying  the  Beltrami  flow  on  each  sub-scale  are  shown. 

Let  us  gain  more  motivation  on  the  advantage  of  the  wavelet  decomposi- 
tion. Fig.  4 shows  the  result  of  composing  the  image  back  from  just  the  first 
2,  and  then  the  first  3 sub-scale  channels.  The  cancellation  of  the  shadowing 
can  also  be  realized  by  a very  simple  high  pass  filter.  However,  as  a byproduct 
of  the  wavelet  decomposition,  at  each  scale  a we  now  have  the  complex  func- 
tion Wa(x,  y,  9).  It  defines  a surface  in  the  5D  space  (3  real  and  one  complex 
dimensions)  (x,y,9,W„).  The  extra  coordinate  6 that  describes  the  behavior 
of  the  image  along  a specific  direction  enables  us  to  smooth  the  image  while 
keeping  the  meaningful  orientation  structure  of  the  texture.  Moreover,  we 
have  the  freedom  to  apply  different  filters  to  the  different  scales.  This  enables 
us  to  preserve  the  nature  of  texture  images  by  processing  them  only  at  signif- 
icant scales.  In  other  words,  we  are  able  to  sharpen  a specific  scale  without 
effecting  the  rest  of  the  sub-band  images.  Fig.  5 is  the  original  image  and 
the  result  of  applying  the  Beltrami  flow  to  filter  out  non-oriented  structures. 
More  examples  are  shown  in  Fig.  6. 


§5.  Concluding  Remarks 

We  proposed  to  combine  a psychophysically  supported  texture  space,  the  2D 
Gabor/Morlet- wavelet  transform,  with  a geometrical  flow  to  enhance  texture 
images.  The  texture  was  considered  as  a manifold  in  its  natural  space.  The 
flow  was  realized  by  invoking  Polyakov  action,  and  the  result  was  the  Beltrami 
flow  in  the  feature  space.  The  result  is  a variational-geometric  technique  that 
enhances  texture  images  in  their  appropriate  decomposition  space. 
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Fig.  4.  Reconstruction  by  summation,  of  only  2,  3,  and  all  layers  of  the  different 
scales:  the  low  frequency  scale  contribute  the  shadowing,  thus  summing 
only  over  the  first  3 scales  cancels  this  effect  (a  simple  high  pass  effect). 


Fig.  5.  Left:  Original  image  128  X 128,  Right:  Result  of  Beltrami  flow  for  70 
numerical  iterations  in  each  sub-scale. 


Fig.  6.  Example  of  2 snapshots  from  the  evolution  for  different  texture  images 
Left:  Original  image  64  x 64. 
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